Description:
This dataset contains two sets of features: real estate attributes, including the 'increased price' rate and the 'foreclosure' rate, and socioeconomic and demographic features. The real estate attributes were gathered from the Zillow website and cover 2010 to 2016. Because the original data had missing values, we used linear interpolation to obtain a value for every year. The dataset contains foreclosure rate data for 427 counties and increased price rate data for 717 counties. The second part, the socioeconomic and demographic features, was gathered from the 2010 US Census data and covers 1806 counties.
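The linear interpolation step mentioned in the description can be illustrated with a minimal sketch. It assumes the yearly rates sit in a pandas DataFrame indexed by year with one column per county; the county codes and values below are invented for illustration, and this is not necessarily the exact procedure used to build the dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical yearly rates for two counties (values invented for illustration).
# NaN marks a year with no value in the source data.
years = list(range(2010, 2017))
rates = pd.DataFrame(
    {
        "01001": [1.0, np.nan, 3.0, np.nan, np.nan, 6.0, 7.0],
        "01003": [2.0, 2.5, np.nan, 3.5, 4.0, np.nan, 5.0],
    },
    index=years,
)

# Fill each gap on a straight line between the nearest observed years.
# limit_area="inside" fills only gaps surrounded by observed values,
# so nothing is extrapolated before the first or after the last observation.
filled = rates.interpolate(method="linear", axis=0, limit_area="inside")

print(filled)
```

Since the years are evenly spaced, `method="linear"` places each missing value on the line between its two nearest observed neighbours.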
Attribute Information:
- cnty: discrete / County FIPS code
- ip2011: continuous / Increased price rate in 2011
- ip2012: continuous / Increased price rate in 2012
- ip2013: continuous / Increased price rate in 2013
- saf2011: continuous / Foreclosure rate in 2011
- saf2012: continuous / Foreclosure rate in 2012
- saf2013: continuous / Foreclosure rate in 2013
- hsgradHC03_VC93ACS3yr$10: continuous / High school graduate rate in 2010
- bachdegHC03_VC94ACS3yr$10: continuous / Bachelor's degree rate in 2010
- logincomeHC01_VC85ACS3yr$10: continuous / Log income in 2010
- unemployAve_BLSLAUS$0910: continuous / Unemployment rate in 2010
- femalePOP165210D$10: continuous / Female population rate in 2010
- hispanicPOP405210D$10: continuous / Hispanic population rate in 2010
- blackPOP255210D$10: continuous / Black population rate in 2010
- forgnbornHC03_VC134ACS3yr$10: continuous / Foreign born population rate in 2010
- county_density: continuous / County density in 2010
- age_lt20: continuous / Percentage of people younger than 20 in 2010
- age_20to39: continuous / Percentage of people aged 20 to 39 in 2010
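Because the real estate features and the census features cover different sets of counties, combining them means matching rows on `cnty`. The sketch below shows one way to do this, assuming the two parts are available as separate pandas DataFrames and that `cnty` may have been read as an integer, which drops the leading zero of a 5-digit FIPS code. The rows and the helper `to_fips` are hypothetical.

```python
import pandas as pd

# Hypothetical rows: column names follow the attribute list, values are invented.
real_estate = pd.DataFrame(
    {"cnty": [1001, 6037, 36061], "ip2011": [0.01, -0.02, 0.03]}
)
census = pd.DataFrame(
    {"cnty": [1001, 6037, 48201], "county_density": [90.0, 2400.0, 2300.0]}
)


def to_fips(codes: pd.Series) -> pd.Series:
    """Format county codes as 5-character, zero-padded FIPS strings."""
    return codes.astype(int).astype(str).str.zfill(5)


real_estate["cnty"] = to_fips(real_estate["cnty"])
census["cnty"] = to_fips(census["cnty"])

# An inner join keeps only the counties present in both parts.
merged = real_estate.merge(census, on="cnty", how="inner")

print(merged)
```

An inner join is used here so that every remaining row has both kinds of features. A left or outer join would instead keep counties that lack one part, leaving those columns empty.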
Citations:
- 'Using Twitter language to predict the Real Estate market', EACL 2017, Mohammadzaman Zamani and Hansen Andrew Schwartz.
- 'Zillow datasets', Zillow website, 2016, "http://www.zillow.com/research/data/", [Accessed: 2016-11-10].
- 'Profile of general population and housing characteristics: 2010 demographic profile data', US Census Bureau, 2010, "https://factfinder.census.gov/faces/tableservices/jsf/pages/productview.xhtml?pid=DEC_10_DP_DPDP1&prodType=table".